A Comparative Evaluation of Feature Set Evolution Strategies for Multirelational Boosting
Abstract
Boosting has established itself as a successful technique for decreasing the generalization error of classification learners by basing predictions on ensembles of hypotheses. While previous research has shown that this technique can be made to work efficiently even in the context of multirelational learning by using simple learners and active feature selection, such approaches have relied on simple and static methods of determining feature selection ordering a priori and adding features only in a forward manner. In this paper, we investigate whether the distributional information present in boosting can usefully be exploited in the course of learning to reweight features and in fact even to dynamically adapt the feature set by adding the currently most relevant features and removing those that are no longer needed. Preliminary results show that these more informed feature set evolution strategies surprisingly have mixed effects on the number of features ultimately used in the ensemble, and on the resulting classification accuracy.
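As a rough illustration of the idea in the abstract, the following Python sketch re-ranks candidate features in every boosting round using the current example distribution, so that features can enter and leave the active set as learning proceeds. It is not the authors' algorithm: the stump learner, the relevance measure (absolute weighted correlation with the label), the fixed active-set size `k`, and the function names `boost_with_evolving_features` and `predict` are all assumptions made for this example, and features are taken to be propositional values in {-1, +1} rather than multirelational ones.

```python
import math


def weighted_correlation(X, y, w, j):
    """Signed correlation of feature j with the labels under distribution w."""
    return sum(wi * yi * row[j] for row, yi, wi in zip(X, y, w))


def boost_with_evolving_features(X, y, rounds=5, k=2):
    """AdaBoost over single-feature stumps with a per-round active feature set.

    X: list of rows with feature values in {-1, +1}
    y: list of labels in {-1, +1}
    k: size of the active feature set (an assumption of this sketch)
    Returns (ensemble, history), where ensemble holds (alpha, feature, sign)
    triples and history records the active set used in each round.
    """
    n, m = len(X), len(X[0])
    w = [1.0 / n] * n          # boosting distribution over examples
    ensemble, history = [], []
    for _ in range(rounds):
        # Re-rank every candidate feature under the current distribution.
        corr = [weighted_correlation(X, y, w, j) for j in range(m)]
        ranked = sorted(range(m), key=lambda j: abs(corr[j]), reverse=True)
        active = ranked[:k]    # features outside the top k are dropped
        history.append(sorted(active))

        # Pick the stump h(x) = sign * x[j] with the lowest weighted error.
        best = None
        for j in active:
            err = (1.0 - abs(corr[j])) / 2.0   # valid because sum(w) == 1
            if best is None or err < best[0]:
                best = (err, j, 1 if corr[j] >= 0 else -1)
        err, j, sign = best
        if err >= 0.5:         # no active feature beats chance
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j, sign))

        # Standard AdaBoost reweighting, followed by normalisation.
        w = [wi * math.exp(-alpha * yi * sign * row[j])
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble, history


def predict(ensemble, row):
    """Sign of the weighted vote of all stumps in the ensemble."""
    score = sum(alpha * sign * row[j] for alpha, j, sign in ensemble)
    return 1 if score >= 0 else -1
```

On a toy problem where the label is the majority vote of three features, the active set in this sketch changes between the first and second rounds: the feature chosen first carries no weighted correlation once the examples are reweighted, so it is displaced by a feature that was previously inactive.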
Similar resources
Towards feature selection for disk-based multirelational learners: a case study with a boosting algorithm
Feature selection is an important issue for any learning algorithm, since reduced feature sets lead to an improvement in learning time, reduced model complexity and, in many cases, a reduced risk of overfitting. When performing feature selection for RAM-based learning algorithms, we typically assume that the cost of accessing each feature is uniform. In multirelational data mining, especially w...
Knowledge Discovery and Data Mining (KDD-2003)
Feature selection is an important issue for any learning algorithm, since reduced feature sets lead to an improvement in learning time, reduced model complexity and, in many cases, a reduced risk of overfitting. When performing feature selection for RAM-based learning algorithms, we typically assume that the cost of accessing each feature is uniform. In multirelational data mining, especially w...
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
Efficacy of Symmetrical and Asymmetrical Pushed Negotiations in Boosting Speaking
This study set out to shed light on the efficacy of pushed output directed by scaffolding on the speaking fluency and accuracy of 41 (24 female and 17 male) upper-intermediate EFL learners. A public version of the IELTS speaking test was held to measure learners’ entrance behavior. Then, they were randomly assigned to symmetrical, asymmetrical, and control groups. The experimental and control groups...
Evaluating the Growth and Evolution of Facility Management in Innovating Integrating and Aligning Business Strategies to Achieve a Competitive Advantage
The South African Facilities Management (FM) industry has seen increased operational strategy complexity from single-site contractors providing basic janitorial services to highly integrated and bundled FM service providers. Despite these major changes, very little research has been conducted on evaluating the effectiveness of FM in innovating, integrating and aligning business strategies to a...